Ad Recommendation Systems for Life-Time Value Optimization

Authors

  • Georgios Theocharous
  • Philip S. Thomas
  • Mohammad Ghavamzadeh
Abstract

The main objective in the ad recommendation problem is to find a strategy that, for each visitor of the website, selects the ad that has the highest probability of being clicked. This strategy could be computed using supervised learning or contextual bandit algorithms, which treat two visits of the same user as two separate, independent visitors, and thus optimize greedily for a single step into the future. Another approach would be to use reinforcement learning (RL) methods, which differentiate between two visits of the same user and two different visitors, and thus optimize for multiple steps into the future, or the life-time value (LTV) of a customer. While greedy methods have been well studied, the LTV approach is still in its infancy, mainly due to two fundamental challenges: how to compute a good LTV strategy and how to evaluate a solution using historical data to ensure its “safety” before deployment. In this paper, we tackle both of these challenges by proposing to use a family of off-policy evaluation techniques with statistical guarantees about the performance of a new strategy. We apply these methods to a real ad recommendation problem, both for evaluating the final performance and for optimizing the parameters of the RL algorithm. Our results show that our LTV optimization algorithm equipped with these off-policy evaluation techniques outperforms the greedy approaches. They also give fundamental insights into the difference between the click-through rate (CTR) and LTV metrics for performance evaluation in the ad recommendation problem.
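
To make the idea of evaluating a new strategy from historical data more concrete, here is a minimal sketch of off-policy evaluation with a statistical guarantee. It uses ordinary importance sampling with a Hoeffding lower confidence bound, a textbook technique chosen here only for illustration and not necessarily the estimator used in the paper. The function names, the trajectory layout, and the toy data are all assumptions.

```python
import math


def importance_sampling_returns(trajectories, target_policy, gamma=1.0):
    """Return one importance-weighted return per logged trajectory.

    Assumed (hypothetical) data layout: each trajectory is a list of
    (state, action, reward, behavior_prob) tuples, where behavior_prob is
    the probability with which the logging policy chose the action.
    target_policy(state, action) gives the new policy's probability.
    """
    weighted = []
    for trajectory in trajectories:
        weight, ret, discount = 1.0, 0.0, 1.0
        for state, action, reward, behavior_prob in trajectory:
            # Reweight by how much more (or less) likely the new policy
            # is to take the logged action than the logging policy was.
            weight *= target_policy(state, action) / behavior_prob
            ret += discount * reward
            discount *= gamma
        weighted.append(weight * ret)
    return weighted


def hoeffding_lower_bound(values, upper, delta=0.05):
    """Lower bound on the mean that holds with probability >= 1 - delta.

    Assumes the values are independent and lie in [0, upper].
    """
    n = len(values)
    mean = sum(values) / n
    return mean - upper * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


# Toy log: two ads (0 and 1) shown uniformly at random, reward 1 per click.
logged = [
    [("new", 0, 1.0, 0.5), ("returning", 0, 1.0, 0.5)],
    [("new", 1, 0.0, 0.5), ("returning", 0, 1.0, 0.5)],
    [("new", 0, 0.0, 0.5), ("returning", 1, 0.0, 0.5)],
]


def always_ad_zero(state, action):
    """Hypothetical new policy: always show ad 0."""
    return 1.0 if action == 0 else 0.0


estimates = importance_sampling_returns(logged, always_ad_zero)
# Weights are at most (1 / 0.5) ** 2 = 4 and returns at most 2, so upper = 8.
print(sum(estimates) / len(estimates), hoeffding_lower_bound(estimates, 8.0))
```

One possible "safety" rule under these assumptions is to deploy the new policy only when the lower bound exceeds the known performance of the current one. With so few trajectories the Hoeffding bound is very loose, which is one reason tighter concentration inequalities are of interest in this setting.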

Similar resources

Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees

In this paper, we propose a framework for using reinforcement learning (RL) algorithms to learn good policies for personalized ad recommendation (PAR) systems. The RL algorithms take into account the long-term effect of an action, and thus, could be more suitable than myopic techniques like supervised learning and contextual bandit, for modern PAR systems in which the number of returning visito...

Automatic Representation for Life-Time Value Recommender Systems

Recommender systems are embedded in almost every commercial site, proposing users items which are likely to draw their interest. While most systems maximize the immediate gain, a better notion of success would be the lifetime value (LTV) of the user-system interaction. The LTV approach instead considers the future implications of the item recommendations, and seeks to maximize over the cumulati...

Energy Efficient Routing in Mobile Ad Hoc Networks by Using Honey Bee Mating Optimization

Mobile ad hoc networks (MANETs) are composed of mobile stations communicating through wireless links, without any fixed backbone support. In these networks, a limited energy supply and frequent topology changes caused by node mobility make routing a challenging problem. TORA is one of the routing protocols that successfully copes with the side effects of node mobility, but it do...

Economic optimization of solar systems in uncertain economic conditions using the Monte Carlo method

Solar energy is an environmentally sustainable energy source because it is clean and inexhaustible. Solar systems are common and cost-effective, and thus can be used for many home applications. In this paper, a new method is presented to optimize solar systems economically with regard to energy cost fluctuations. Unlike conventional analyses, in which inflation is considered constant, ...


Publication date: 2015